Abstract


The problem of selecting the load index or indices to be used in dynamic load balancing policies
is discussed. One such index, based on a mean-value equation, is proposed, and its main
characteristics investigated. The index is obtained assuming that the goal of the load balancing
scheme is the minimization of the response time of the user command being considered for possible
remote execution. A few major obstacles to the practical use of the index are also discussed.
1. Introduction

o..  o,  .h. 

servers)  accessible  on  demand  to  the  users. 

In  npil.  of  th«o  moehnnism.  Md  preinion.,  .e  cannot  Male  lint  ^ 

T  n^tc't'Lkt^^l  *rbc  nccc«ly  to  rebalance  the  loads  periodically;  that  is  we  will  have  to 

iir  rj;™ 

ever,  shorter-lived  „umalc  (or  ipn.mic)  approaches  (ll. 

vvorkload’s  frequency  spectrum,  can  only  be  ehminatca  oy  ou.p  y  v  ^  network-wide  load 

In  principle,  providing  the  tlbeif  indi'^Tdual  contributions  to  the  balancing  of  the 

reporting  command  allows  counterproductive,  due  to  the  frequency  of  the 

loads.  This  may,  however,  be  necessarily  limited  and  incomplete 

S£H:H&r,c=..rx=r 

in,  schenres  the  rest  of  the  paper  „  p,opose  in  this  paper  is  intro 

tions  to  be  made  are  presented  in  *  »  knirp  Section  5  discusses  the  advantages 


2. Dynamic load balancing policies

Si.ee  the  terminology  nsjd  i.  “ 

rtievliltsSrs  :;“j:."m:it;“''.e/pe  of  o.r  i..esti.ation.  This  will  he  done  both 
in this and in the next section.

First we notice that the two terms "load balancing" and "load sharing" very often appear
in the literature with the same meaning. One could easily introduce a distinction between them
based on the different meanings the terms "balancing" and "sharing" suggest: for instance, use of
the term "balancing" could be restricted to those schemes whose objective is to keep the loads on
the machines within a relatively narrow band around the instantaneous average, whereas "sharing"
could refer to those schemes in which a machine sends some of its load (or accepts
some of the load of other machines) only when its load goes beyond an upper threshold (or falls
below a lower threshold). In both types of schemes the decision-maker must know the load existing
on the machine being considered; in the load balancing ones, also the current average system
load [2] must be known. Note, however, that though drawing a distinction between "balancing"
and "sharing" may be useful in certain contexts, both types of schemes are dealt with in the same
way in this paper. We therefore use "load balancing" as a generic term encompassing both, even
though our primary objective in selecting a load index is not that of equalizing the loads (we shall
indeed see that, with our approach, this objective would be meaningless).
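The distinction drawn above can be illustrated with a small sketch. The function names, the band width, and the threshold values are illustrative assumptions and do not come from any scheme cited in this paper; loads are treated as plain numbers, whatever index is used to obtain them.

```python
from statistics import mean


def balancing_action(load, all_loads, band=0.1):
    """'Balancing' in the narrow sense: keep the load within a band
    around the instantaneous average (band is a hypothetical fraction)."""
    avg = mean(all_loads)
    if load > avg * (1 + band):
        return "send"
    if load < avg * (1 - band):
        return "receive"
    return "none"


def sharing_action(load, upper, lower):
    """'Sharing': act only when the local load crosses a threshold;
    no knowledge of the system-wide average is needed."""
    if load > upper:
        return "send"
    if load < lower:
        return "receive"
    return "none"
```

Note that only the first function needs the loads of the other machines, which matches the observation that balancing schemes must also know the current average system load.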

It is also useful to distinguish preemptive from non-preemptive load balancing schemes. In
the former a running process may be suspended and migrated to a remote machine, where its
execution will resume from the point of suspension. A non-preemptive scheme is one in which a
process is assigned to a machine before beginning its execution, and cannot be moved to another
after its execution has begun. We shall usually refer to non-preemptive schemes in the sequel,
though most of our considerations apply to the preemptive ones as well. In a non-preemptive
scheme the local machine is the machine at which a given process entered the system.

Another  cl.sification  of  lo^  balancio,  f  Zn^a^i^d 

the  local  m^hine  takes  the  init  ative  wne  |3]^  instead,  underloaded 

‘p^o^ec^  .bo«.  .h.i,  ..npble  «.«  » 

as  to  attract  currently  running  or  soon-to-arrive  new  processes. 

Thus  the  initiator  wUl  have  to  select  senders  and  receivers  in  the  ’ 

of the eligible receivers in sender-initiated schemes, and one of the eligible senders in
receiver-initiated schemes. This selection can be either load-independent or load-dependent.

a probabilistic policy which selects destinations or sources according to probabilities
proportional to their processing speeds.
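A load-independent probabilistic policy of this kind can be sketched in a few lines. The function name `pick_by_speed` and the dictionary layout are assumptions made for illustration; the only property taken from the text is that selection probabilities are proportional to processing speeds.

```python
import random


def pick_by_speed(speeds, rng=random):
    """Pick a machine name with probability proportional to its speed.

    speeds: dict mapping machine name -> processing speed (any unit,
            non-negative, at least one strictly positive)
    rng:    the random module or a random.Random instance
    """
    names = list(speeds)
    weights = [speeds[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

No load measurement appears anywhere in the function, which is what makes the policy load-independent.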

Examples of load-dependent policies include Lowest Load [6,7,8], which selects the machine

‘p:.;:ro:;v.e,««dn.„..e.^ 

tuch  t  SbSy  nr^ine  invite,  o,  poU.  the  other.,  .o  thnt  only  tho„  P'“r‘7' 

Which  a  ng  y  threshold  will  ship  processes  to  the  mitiating  machine  (note  that, 

who«  loud,  nre  obtain  the  Hijhent  Lond  policy,  the 

tre^Si".^  nni"  T 

heavily to the less heavily loaded machine in dynamically defined pairs of machines, and the
policy used in the MOS operating system [10], which cyclically selects machines among the lightly
loaded ones in a subset of the eligible machines.
While in load-dependent policies the load of each machine in the system is to be measured,
and its value known by at least some of the possible decision-makers, load-independent policies do
not have such a requirement. However, as was observed at the beginning of this section, at the
very least a policy must rely on the knowledge of the load on the local machine, since shipping
processes is not a zero-cost operation and should be done only when needed. Thus the load of
each machine must be measured, i.e., quantitatively expressed, in all cases. What index (or
indices) should be used to measure a machine's load?


3. A criterion for load index selection

Many indices have been explicitly or implicitly used in the load balancing literature to
express the load existing on a machine at a given time. Examples of such load indices include the
utilization of the CPU, the length of the ready queue (in UNIX† terminology, the load average),
the stretch factor (defined as the ratio between the execution time of a process on a loaded
machine and its execution time on the same machine when it is empty), and more complicated
ru^ct  ous  o'^^  HOWS,.,,  to  the  .uthor’s  kuowlsdgc,  .  scis..i.c  justificutmu 

the  ch*“rf  u  lu>5  ini'"  1“  r'”  “Tb  .‘"ThuS 

altogether and simply refer to "the load" as if a universally accepted definition of it had
been known for a long time. Is any of the indices used in the literature a correct one? Which?
And what does it mean for a load index to be "correct"?

To simplify our discussion, we shall assume that the object to which load balancing applies
is the interactive user command, as represented by the typing in of a command.
In other words, the execution of a command will be considered atomic from the load
balancing viewpoint, even when a command causes the creation and execution of several processes
that in principle could be executed on different machines. This assumption is made to facilitate
the description, but is not essential for the application, of our approach.

An  assumption  that  is  essential  is  the  choice  of  the  command’s  response  time  as 

throughput  maximization  are  unknown,  but  (we  believe)  not  unreasonable. 

Under these assumptions, a correct load index li must be such that the relationship between
the response time rt of a command and the index is represented by a single-valued function, i.e.,
the response time is a function of li. The reason for this condition is obvious: if the function

fXuu  X.  X ”o.VdurI.iou.  nud  .  glvun  courmund  I.  known,  (b.  vbuu  of  ,1  n. 

.  ^  •  •  d  Ku,  mwrlp  CM  be  used  to  predict  the  response  time  that  the  command  will 

htleTit^vilTSTert  to  that  machine.  The  predicted  response  times  for 

d-  1  ri  n«  tlip  local  onel  adjusted  for  the  expected  communications  delays  due  to  the  shipment  o 

XXtiblX"Nob  Ihrth'l.  condlllon.  .vnn  thongh  It  to  b.  not  v..,  ,«trict.v., 

shortest  possible  _  proposed  in  the  literature.  Simple  experiments  per- 

x*  by'i’’  :ior^bS.  x  tU;  r.'irv'r. „  t.  cpu  .tiu..tio.  ,.«iy  ,.™. 

length, stretch factor, and other indices are multi-valued ones, for at least some types of
commands.

If more restrictive conditions are imposed, the selection of the least loaded machine can be
made more efficient. For instance, in a system consisting of identical machines, a
non-decreasing rt(li) function would allow the decision-maker to restrict the choice to


† UNIX is a trademark of AT&T Bell Laboratories.
the local machine and the remote machine with the smallest value of li, and to compute the
function only for these two machines. Furthermore, if the monotonic function has a known minimum
slope or, even better, is linear, then all comparisons will involve only the values of li, and no
computation of rt will have to be performed. These observations can be extended to heterogeneous
systems, but will apply only to each group of identical machines within them.
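The shortcut for identical machines can be sketched as follows. This is a minimal illustration under stated assumptions: `choose_machine`, `li`, `rt`, and `comm_delay` are hypothetical names, rt is assumed to be non-decreasing, and the communication delay is modeled as a single constant added to the remote prediction.

```python
def choose_machine(local, li, rt, comm_delay=0.0):
    """Return the machine predicted to give the shortest response time.

    local:      name of the local machine
    li:         dict machine -> load index value for the command at hand
    rt:         non-decreasing function mapping a load index value to a
                predicted response time (identical machines assumed)
    comm_delay: assumed extra delay for shipping the command remotely
    """
    remotes = [m for m in li if m != local]
    if not remotes:
        return local
    # Because rt is non-decreasing, only the remote machine with the
    # smallest index can be the best remote choice, so rt is evaluated
    # for two machines at most.
    best_remote = min(remotes, key=lambda m: li[m])
    if rt(li[best_remote]) + comm_delay < rt(li[local]):
        return best_remote
    return local
```

For example, with `li = {"local": 4.0, "m1": 1.0, "m2": 2.5}` and a linear `rt`, the command is shipped to `m1` only if the communication delay does not cancel the predicted gain.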

Note that our choice of command response time minimization as the objective of load
balancing makes it impossible to define the load of a machine as a command-independent quantity.
This means that our answer to the question: "How much load is there on this machine?" will
be another question: "For what command?". Thus, our approach does not help when the policy is
load-independent, unless a standard "basket" of commands is defined to be used in the computation
of a command-independent load index for each machine. Also, our approach does not provide
help in making command migration decisions when the policy is preemptive, again unless the
idea of a basket of commands is adopted.
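One possible reading of the "basket" idea is a weighted average of the command-dependent index over a fixed set of reference commands. The averaging rule below is an assumption, since the text does not say how the basket is to be combined; `basket_index`, `index_for`, and `weights` are hypothetical names.

```python
def basket_index(basket, index_for, weights=None):
    """Command-independent load figure for one machine.

    basket:    list of reference commands (any hashable identifiers)
    index_for: function command -> command-dependent load index on the
               machine of interest
    weights:   optional dict command -> relative frequency; equal
               weights are assumed when omitted
    """
    if not basket:
        raise ValueError("the basket must contain at least one command")
    if weights is None:
        weights = {cmd: 1.0 for cmd in basket}
    total = sum(weights[cmd] for cmd in basket)
    return sum(weights[cmd] * index_for(cmd) for cmd in basket) / total
```

The same basket would have to be used on every machine for the resulting values to be comparable.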


4. A load index based on a mean-value equation

In  this  section,  we  shall  propose  a  load  index  which,  under  certain  assumptions,  satisfies  the 
criterion  introduced  in  the  previous  section.  Section  5  outlines  a  possible  implementation  of  a 
scheme  based  on  the  index,  and  discusses  some  of  the  index’s  properties  and  potential  drawbacks. 

Consider  a  machine  M,  a  command  A,  and  a  mix  of  commands  B.  Among  all  the  possible 
loads  that  M  may  be  processing,  consider  the  following: 

(A)  command  A  runs  alone  on  M; 

(B)  mix  B  (the  background  load)  runs  on  M; 

(C) the combination of A and B runs on M.

Our problem can now be expressed in these terms: predict the response time of A when
C is running on M from the knowledge of A and of the background load B (the load that was
there just before the arrival of A).

We make the assumption that machine M can be accurately modeled by a closed queueing
network model having:

(i) R chains;

(ii) L service centers 1, 2, ..., L of the FCFS, PS (processor sharing), LCFSPR
(last-come-first-served-preemptive-resume), and IS (infinite servers) types [11]; center 1 is
an IS-type service center representing user terminals;

(iii)  a  fixed  number  of  customers  (i.e.,  commands)  in  each  chain; 

(iv)  service  rates  independent  of  the  number  of  customers  at  the  respective  centers; 

(v)  FCFS  centers  that  are  all  single-server  centers. 

The  three  loads  A,  B,  and  C  can  be  modeled  as  follows: 

(a)  command  A  is  the  only  customer  in  chain  1; 

(b) the commands in load B are clustered, and each cluster is represented by one of the
chains 2 through R (R is set equal to the number of clusters of B plus 1);

(c)  load  C  is  represented  by  chains  1  through  R. 
Under these assumptions, the mean-value equation of Corollary 1 of [12] holds for each
non-IS center $l$ in such a model:

$$ w_{r,l}(K) = \tau_{r,l} \left[ 1 + n_l(K - e_r) \right], \tag{1} $$

where

$K$ = population vector ($k_r$ = population size of chain $r$),
$e_r$ = $R$-dimensional unit vector in direction $r$,
$w_{r,l}$ = mean time spent by a chain $r$ customer at center $l$ at each visit,
$\tau_{r,l}$ = mean service time per visit of a chain $r$ customer at center $l$,
$n_l(K - e_r)$ = mean number of customers (mean queue length) at center $l$ in the same
queueing network with one less customer in chain $r$.


Note that, for an IS center, we have

$$ w_{r,l} = \tau_{r,l}. \tag{2} $$

If the model includes IS centers other than center 1, the corresponding $n_l$ will be defined to be 0.

Denoting by $rt_r(X)$ the mean response time (i.e., the mean time spent outside service center
1) of a chain $r$ command under load $X$, and by $v_{r,l}$ the mean number of visits a chain $r$
customer makes to service center $l$, we can write for load A

$$ rt_1(A) = \sum_{l=2}^{L} v_{1,l} \, w_{1,l}(A), \tag{3} $$

and for load C

$$ rt_1(C) = \sum_{l=2}^{L} v_{1,l} \, w_{1,l}(C). \tag{4} $$


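Substituting (1) into (4), and using (3), we obtain

$$ rt_1(C) = \sum_{l=2}^{L} v_{1,l} \, \tau_{1,l} + \sum_{l=2}^{L} v_{1,l} \, \tau_{1,l} \, n_l(B) = rt_1(A) + \sum_{l=2}^{L} v_{1,l} \, \tau_{1,l} \, n_l(B), \tag{5} $$

since $K - e_1$ represents load C with one less customer in chain 1, i.e., with no customers in that
chain; in other words, it represents load B.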

By (2), the increase in command response time can be written as:

$$ \Delta rt = rt_1(C) - rt_1(A) = \sum_{l=2}^{L} v_{1,l} \, w_{1,l}(A) \, n_l(B). \tag{6} $$
Thus under the assumptions made, the increase in the response time of a command A is a linear
combination of the queue lengths at the non-IS centers under load B, the coefficients being the
total times spent by A at those centers when running alone on the same machine. Note

bt  Snoloi  Xt-i  >!>.  fc.Elb  .t  .  J.0  th.  c»su,n,.,  .» 

Furthermore, note that IS centers do not contribute to the sum on the right-hand side of (6).

Equation (6) can be used to predict $rt_1(C)$. Its right-hand side is a load index satisfying
the condition discussed in Section 3. Furthermore, rt is a linear function of the index, which, in
turn, is a linear function of the mean queue lengths.
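The computation implied by equation (6) is small enough to sketch directly. The function names and the dictionary layout below are illustrative assumptions; the quantities themselves are the visit counts, the per-visit times measured when the command runs alone, and the queue lengths under the background load.

```python
def load_index(visits, alone_times, queue_lengths):
    """Right-hand side of equation (6) for one command on one machine.

    visits[l]        : mean number of visits of the command to center l
    alone_times[l]   : mean time per visit at center l when the command
                       runs alone on the machine
    queue_lengths[l] : mean queue length at center l under the
                       background load; taken as 0 for centers that are
                       missing from the dict (e.g. IS centers)
    All dicts are keyed by the non-terminal centers l = 2..L.
    """
    return sum(visits[l] * alone_times[l] * queue_lengths.get(l, 0.0)
               for l in visits)


def predicted_response_time(visits, alone_times, queue_lengths):
    """Response time alone (equation (3)) plus the load index."""
    rt_alone = sum(visits[l] * alone_times[l] for l in visits)
    return rt_alone + load_index(visits, alone_times, queue_lengths)
```

With hypothetical values `visits = {"cpu": 10, "disk": 4}`, `alone_times = {"cpu": 0.02, "disk": 0.05}`, and `queue_lengths = {"cpu": 2.0, "disk": 0.5}`, the index is 10(0.02)(2.0) + 4(0.05)(0.5) = 0.5 time units, added to a stand-alone response time of 0.4.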


5. Characteristics of the new index

The load index introduced in the previous section is a linear combination of queue lengths;
thus the two ingredients that are needed to compute it for each eligible machine are the queue
lengths and their coefficients, which are the total times spent by the command in the respective
servers when there is no background load on the machine. The question of what queues or servers
are to be considered does not have a direct answer. Ideally, the servers should be those appearing
in the product-form closed queueing network model of the machine; in practice, they

:f;".  “  rbr  i^"d  b‘'  “dtsrtsr; 

tiuns  of  tbs  execution  intervals  bet.een  two  consecutive  potential  suspension  poinU  .ben  tb 

no  extra  load.  , 

Instantaneous queue length measurements are not difficult to perform. They could be gathered
periodically (or on demand) and broadcast (or sent to the requesting machine). Whether

instantaneous  al  command  would  be  another  vector,  with  com- 

geneous networks is a major advantage of the load index introduced in this paper.

The dependence of the value of the index on the particular command being considered is a
very simple one, and the coefficients that characterize each command are easy to measure.
However, command dependence causes two problems:

(i) the absolute load of a machine cannot be defined; this problem can be alleviated, as noted
in Section 3, by defining a standard workload (a "basket" of commands); in any case, the
index only measures the load relative to a given command or mix of commands;

(ii) the coefficients characterizing a command generally depend on the command's argu-

to  tbu  ILmJi:  cod.  UIUP  ..»«  wtcoibklc  cbuuget  iu  tb.  vuluc.  ol  tb.  .0.1I.....tu. 

Problem (ii) is a very serious one, and needs to be investigated, as its satisfactory solution is

by  functions  with  »“”P  efficients  (or  at  least  some  of  them)  of  text  processing  and  com- 
Another question that needs to be addressed is the one concerned with the realism of the
machine  be  representea  oy  h  h  in  4  was  that  service  rates 

~  thU  ho..v.,,  .p..d  ..ke  ^  »d.x  .»c 

1  «  the  mean-value  equation  in  this  case  (see  Theorem  1  of  1121)  involves  me 

Even more fundamentally, one must wonder whether an equation that is valid for a model
in steady state can be taken as the basis for the definition of a load index to be used in a highly
dynamic context. We have already encountered this problem in our discussion about whether the

Xted.t.X  i.sU.uneo«s  ■>' 

empirical  and  simulation-based)  is  to  be  resorted  to  in  order  to  obtain  a  reliable  answer. 


6. Conclusion

bpiog  con.id.rrf,  th«  cogdicknu  “f  n^WoptMTd'islIibuted  .ystem 

it  i.  dlone  ^  ‘.be  eo™n.a.d’.  .,gum,..s, 

tigated  now. 